Effective Job Execution in Hadoop Over Authorized Deduplicated Data
Similar resources
FP-Hadoop: Efficient Execution of Parallel Jobs Over Skewed Data
Big data parallel frameworks, such as MapReduce or Spark, have been praised for their high scalability and performance, but they perform poorly in the case of data skew. There are important cases where a high percentage of processing in the reduce side ends up being done by only one node. In this demonstration, we illustrate the use of FP-Hadoop, a system that efficiently deals with data skew ...
Hadoop performance modeling and job optimization for big data analytics
Big data has gained momentum in both academia and industry. The MapReduce model has emerged as a major computing model in support of big data analytics. Hadoop, which is an open source implementation of the MapReduce model, has been widely taken up by the community. Cloud service providers such as Amazon EC2 now support Hadoop user applications. However, a key challenge is ...
Job Attentive Scheduling Algorithm in Hadoop
In recent years cloud services have gained much attention as a result of their availability, scalability, and low cost. One use of these services has been for the execution of scientific workflows as part of Big Data Analytics, which are employed in a diverse range of fields including astronomy, physics, seismology, and bioinformatics. There has been much research on heuristic scheduling algori...
Fault-Tolerant Job Execution over Multi-Clusters Using Mobile Agents
AgentTeamwork is a mobile-agent-based job coordination system that targets a mixture of computing nodes, some directly connected to the public Internet and others simply clustered in a private IP domain but not managed by a commodity job scheduler. The system allows its mobile agents to carry a user job with them from the public to private IP domains as well as to form a hierarchy where agents ...
Research on Job Scheduling Algorithm in Hadoop
Building on an in-depth study of the Fair Scheduling strategy in Hadoop clusters, this paper defines the Node Health Degree by constructing a relationship function between node load and job failure rate, and proposes a job scheduling algorithm based on Node Health Degree. Nodes are grouped, according to Node Health Degree, into three categories in order to assign corresponding job in accord...
Journal
Journal title: Webology
Year: 2020
ISSN: 1735-188X
DOI: 10.14704/web/v17i2/web17043